Add JSON schema #1426

Open
wants to merge 11 commits into
base: dev

@franzpoeschel franzpoeschel commented Apr 18, 2023

TODO:

  • Define particlePatches
  • Fill in things like "description", "type", "title", ...
  • Maybe add support for abbreviated JSON representations (not now)
  • Dataset layout

I have added a CLI tool, openpmd-convert_json_toml, which exposes existing internal functionality for converting between TOML and JSON. I think it's better to define and maintain the schema in TOML (it can be commented and is more legible) and just compile it to JSON for verification.

x-ref #1436


@ax3l ax3l left a comment

Great work, thank you!

Wondering a bit if validation of an XML backend would be easier.... not sure.

.github/workflows/linux.yml
Comment on lines +259 to +279
# We need to exclude the thetaMode example since that has a different
# meshesPath and the JSON schema needs to hardcode that.

Are we able to patch this in check.py?

@franzpoeschel (Contributor, Author) replied:

Not very easily. The JSON schema lives on the file system, and the individual .json files refer to each other by file name. Changing this at runtime of check.py would require (1) traversing the entire JSON schema and overriding the meshes path, the particles path and the references, and (2) somehow setting up python-jsonschema to cross-reference in-memory schemas, which I don't even know whether it supports.
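For a sense of what step (1) alone would involve, here is a rough sketch of an in-memory traversal that renames a hardcoded property. It assumes the path shows up as a key under "properties" and as an entry in "required", which is a guess about the schema layout, not a description of the actual files. The helper `rename_property` and the example schema are hypothetical, and step (2), resolving references between in-memory schemas, is not addressed at all.

```python
from typing import Any


def rename_property(schema: Any, old: str, new: str) -> Any:
    """Return a copy of `schema` in which the property `old` is called `new`.

    Only "properties" keys and "required" entries are rewritten; everything
    else is copied recursively.
    """
    if isinstance(schema, list):
        return [rename_property(item, old, new) for item in schema]
    if not isinstance(schema, dict):
        return schema
    result = {}
    for key, value in schema.items():
        if key == "properties" and isinstance(value, dict):
            result[key] = {
                (new if name == old else name): rename_property(sub, old, new)
                for name, sub in value.items()
            }
        elif key == "required" and isinstance(value, list):
            result[key] = [new if name == old else name for name in value]
        else:
            result[key] = rename_property(value, old, new)
    return result


# Hypothetical schema fragment with a hardcoded "meshes" path.
iteration_schema = {
    "type": "object",
    "required": ["meshes"],
    "properties": {
        "meshes": {"$ref": "meshes.json"},
        "particles": {"$ref": "particles.json"},
    },
}

patched = rename_property(iteration_schema, "meshes", "fields")
print(sorted(patched["properties"]))
```

Even in this simplified form, the "$ref" values still point at files on disk, which is exactly the part that makes patching at runtime awkward.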

franzpoeschel commented Aug 17, 2023

> Wondering a bit if validation of an XML backend would be easier.... not sure.

The major issue with the JSON schema approach is that it is not a predictive parser (https://en.wikipedia.org/wiki/Recursive_descent_parser), meaning that it deals with {"anyOf": [<option1>, <option2>, ...]}-like statements very naively. If there is an error somewhere deep down the hierarchy, the validator only sees that all options have failed and cannot give any further detail about where the failure occurred.
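The effect can be reproduced with a toy validator. The sketch below is not python-jsonschema; it is a hypothetical, deliberately tiny implementation that supports only "type", "required", "properties" and "anyOf". With `explain=False` it reports an anyOf failure the naive way. With `explain=True` it applies one possible heuristic (an assumption of this sketch, not a feature of any library): report the errors of the option that failed deepest in the hierarchy.

```python
TYPES = {"object": dict, "number": (int, float), "string": str}


def depth(errors):
    """Deepest path reached by any error in the list."""
    return max(path.count(".") for path, _ in errors)


def validate(instance, schema, path="$", explain=False):
    """Return a list of (path, message) errors; an empty list means valid."""
    expected = schema.get("type")
    if expected and not isinstance(instance, TYPES[expected]):
        return [(path, f"expected {expected}")]
    errors = []
    if isinstance(instance, dict):
        for name in schema.get("required", []):
            if name not in instance:
                errors.append((path, f"missing key {name!r}"))
        for name, sub in schema.get("properties", {}).items():
            if name in instance:
                errors += validate(instance[name], sub, f"{path}.{name}", explain)
    if "anyOf" in schema:
        branches = [validate(instance, opt, path, explain) for opt in schema["anyOf"]]
        if all(branches):  # every option produced at least one error
            if explain:
                errors += max(branches, key=depth)  # heuristic: deepest failure
            else:
                errors.append((path, "no option of anyOf matched"))
    return errors


# Hypothetical schema: "E" is either a mesh-like record or a constant record.
mesh = {"type": "object", "required": ["unitSI"],
        "properties": {"unitSI": {"type": "number"}}}
constant = {"type": "object", "required": ["value"],
            "properties": {"value": {"type": "number"}}}
schema = {"type": "object", "properties": {"E": {"anyOf": [mesh, constant]}}}

data = {"E": {"unitSI": "1.0"}}  # wrong type deep inside the first option

print(validate(data, schema))                # [('$.E', 'no option of anyOf matched')]
print(validate(data, schema, explain=True))  # [('$.E.unitSI', 'expected number')]
```

The heuristic only guesses which option the writer intended; a handwritten parser can instead decide up front which option applies and report errors for that one alone.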

If we want to seriously consider XML as an alternative, my opinion is that it must at least solve this particular issue, which I doubt it can do properly, since this involves a bit of parser theory and grammar transformation.

This job is better done by handwritten parsers, which the openPMD-api and the openPMD-validator are.

However, if we add an XML backend at some point, an XML schema is definitely worth considering.

Written as .toml files for ease of documentation, maintainability and
readability.
Needed for "compiling" the schema to JSON
Also add a Makefile to further simplify this
The JSON schema verification package does not like that
Both of the form "data not found in places where data was expected"
Verify all JSON-openPMD files written by testing against the schema